Section 8

Convergence of Laws. Selection Theorem.

In this section we will begin the discussion of weak convergence of distributions on metric spaces. Let $(S, d)$
be a metric space with a metric $d$. Consider a measurable space $(S, \mathcal{B})$ with Borel $\sigma$-algebra $\mathcal{B}$ generated by
open sets and let $(\mathbb{P}_n)_{n \ge 1}$ and $\mathbb{P}$ be probability distributions on $\mathcal{B}$. We define

$$C_b(S) = \{ f : S \to \mathbb{R} \ \text{continuous and bounded} \}.$$

We say that $\mathbb{P}_n \to \mathbb{P}$ weakly if

$$\int f \, d\mathbb{P}_n \to \int f \, d\mathbb{P} \quad \text{for all } f \in C_b(S).$$

Theorem 18 If $S = \mathbb{R}$ then $\mathbb{P}_n \to \mathbb{P}$ weakly iff

$$F_n(t) = \mathbb{P}_n((-\infty, t]) \to F(t) = \mathbb{P}((-\infty, t])$$

for any point of continuity $t$ of $F$.
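The restriction to continuity points matters. As a small illustration that is not part of the original notes, take the hypothetical laws $\mathbb{P}_n = \delta_{1/n}$ and $\mathbb{P} = \delta_0$. Then $\int f \, d\mathbb{P}_n = f(1/n) \to f(0)$ for every continuous $f$, so $\mathbb{P}_n \to \mathbb{P}$ weakly, yet $F_n(0) = 0$ for all $n$ while $F(0) = 1$. The sketch below (function names are made up for this example) evaluates the c.d.f.s at the discontinuity point and at a continuity point.

```python
def cdf_point_mass(location, t):
    """C.d.f. of the point mass at `location`: F(t) = 1 if location <= t, else 0."""
    return 1.0 if location <= t else 0.0


def F_n(n, t):
    """C.d.f. of the hypothetical law P_n = delta_{1/n}."""
    return cdf_point_mass(1.0 / n, t)


def F(t):
    """C.d.f. of the limit law P = delta_0."""
    return cdf_point_mass(0.0, t)


# At t = 0, a discontinuity point of F, the c.d.f.s do not converge.
values_at_zero = [F_n(n, 0.0) for n in (1, 10, 100, 1000)]

# At a continuity point such as t = 0.05 they agree with F once 1/n <= t.
values_at_positive_t = [F_n(n, 0.05) for n in (1, 10, 100, 1000)]
```

Only the behaviour at $t = 0$ fails, which is exactly the point excluded by the theorem.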

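Proof. "$\Longrightarrow$" Let us approximate an indicator function by continuous functions as in Figure 8.1, i.e.

$$\varphi_1(x) \le I(x \le t) \le \varphi_2(x), \quad \varphi_1, \varphi_2 \in C_b(\mathbb{R}),$$

where, for a given $\varepsilon > 0$, $\varphi_1 = 1$ on $(-\infty, t - \varepsilon]$ and $\varphi_2 = 0$ on $[t + \varepsilon, \infty)$.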
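One possible choice of the two approximating functions is piecewise linear. The following is a minimal sketch, assuming linear interpolation over a window of width $\varepsilon$ on either side of $t$; the names `phi1`, `phi2` and `indicator` are illustrative and not taken from the notes.

```python
def indicator(x, t):
    """I(x <= t)."""
    return 1.0 if x <= t else 0.0


def phi1(x, t, eps):
    """Lower approximation: 1 for x <= t - eps, 0 for x >= t, linear in between."""
    return max(0.0, min(1.0, (t - x) / eps))


def phi2(x, t, eps):
    """Upper approximation: 1 for x <= t, 0 for x >= t + eps, linear in between."""
    return max(0.0, min(1.0, 1.0 - (x - t) / eps))
```

Both functions are continuous and bounded, and they satisfy $I(x \le t - \varepsilon) \le \varphi_1(x) \le I(x \le t) \le \varphi_2(x) \le I(x \le t + \varepsilon)$, which is the chain of inequalities the proof relies on.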

For convenience of notation, instead of writing integrals w.r.t. $\mathbb{P}_n$ we will write expectations of a r.v. $X_n$
with distribution $\mathbb{P}_n$.

Figure 8.1: Approximating indicator.

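$$\mathbb{P}(X \le t - \varepsilon) \le \mathbb{E}\varphi_1(X) \leftarrow \mathbb{E}\varphi_1(X_n) \le F_n(t) = \mathbb{P}(X_n \le t) \le \mathbb{E}\varphi_2(X_n) \to \mathbb{E}\varphi_2(X) \le \mathbb{P}(X \le t + \varepsilon)$$
as $n \to \infty$. Therefore, for any $\varepsilon > 0$,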

$$F(t - \varepsilon) \le \liminf_{n \to \infty} F_n(t) \le \limsup_{n \to \infty} F_n(t) \le F(t + \varepsilon).$$
Since $t$ is a point of continuity of $F$, letting $\varepsilon \downarrow 0$ proves the result.

"$\Longleftarrow$" Let $PC(F)$ be the set of points of continuity of $F$. Since $F$ is monotone, it has at most countably many discontinuities, so the set $PC(F)$ is dense
in $\mathbb{R}$. Take $M$ large enough such that both $M, -M \in PC(F)$ and $\mathbb{P}([-M, M]^c) \le \varepsilon$. Clearly, for large enough
$k$ we have $\mathbb{P}_k([-M, M]^c) \le 2\varepsilon$. For any $n \ge 1$, take a sequence of points

$$-M = x_1^n \le x_2^n \le \cdots \le x_n^n = M$$

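such that all $x_i^n \in PC(F)$ and $\max_i |x_{i+1}^n - x_i^n| \to 0$ as $n \to \infty$. Given a function $f \in C_b(\mathbb{R})$, consider an
approximating function

$$f_n(x) = \sum_{1 < i \le n} f(x_i^n) \, I\bigl(x \in (x_{i-1}^n, x_i^n]\bigr) + 0 \cdot I\bigl(x \notin [-M, M]\bigr).$$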

Since $f$ is continuous, and hence uniformly continuous on $[-M, M]$,

$$\sup_{|x| \le M} |f_n(x) - f(x)| \le \delta_n \to 0, \quad n \to \infty.$$

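Since all $x_i^n$ are in $PC(F)$ we can write

$$\mathbb{E} f_n(X_k) = \sum_{1 < i \le n} f(x_i^n) \bigl(F_k(x_i^n) - F_k(x_{i-1}^n)\bigr) \to \sum_{1 < i \le n} f(x_i^n) \bigl(F(x_i^n) - F(x_{i-1}^n)\bigr) = \mathbb{E} f_n(X) \quad \text{as } k \to \infty.$$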

Also, 

$$|\mathbb{E} f(X) - \mathbb{E} f_n(X)| \le \|f\|_\infty \, \mathbb{P}(X \notin [-M, M]) + \delta_n \le \|f\|_\infty \varepsilon + \delta_n$$

and, similarly, 

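$$|\mathbb{E} f(X_k) - \mathbb{E} f_n(X_k)| \le \|f\|_\infty \, \mathbb{P}(X_k \notin [-M, M]) + \delta_n \le \|f\|_\infty 2\varepsilon + \delta_n.$$
Letting $k \to \infty$, $\varepsilon \to 0$ and $n \to \infty$ proves that $\mathbb{E} f(X_k) \to \mathbb{E} f(X)$.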

□ 
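The step-function approximation used in the second half of the proof can be sketched numerically. This is a minimal illustration, assuming a uniform grid on $[-M, M]$ (which consists of continuity points whenever $F$ is continuous) and the test function $f = \cos$; the helper names are hypothetical.

```python
import math
from bisect import bisect_left


def make_step_approximation(f, M, n):
    """Return (f_n, xs): the step function built from f on an n-point grid of [-M, M]."""
    # Uniform grid -M = x_1 <= ... <= x_n = M. Assumption: every grid point is a
    # continuity point of F, which holds automatically when F is continuous.
    xs = [-M + 2 * M * i / (n - 1) for i in range(n)]

    def f_n(x):
        if x <= -M or x > M:
            return 0.0  # f_n vanishes outside (-M, M]
        i = bisect_left(xs, x)  # smallest i with xs[i] >= x, so x lies in (xs[i-1], xs[i]]
        return f(xs[i])

    return f_n, xs


def sup_error(f, M, n, samples=2000):
    """Approximate the sup of |f_n - f| over (-M, M] on a fine sample grid."""
    f_n, _ = make_step_approximation(f, M, n)
    points = [-M + 2 * M * j / samples for j in range(1, samples + 1)]
    return max(abs(f_n(x) - f(x)) for x in points)
```

For $f = \cos$, which is 1-Lipschitz, the error $\delta_n$ is bounded by the mesh $2M/(n-1)$, so `sup_error(math.cos, 2, n)` shrinks as `n` grows.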

We say that a sequence of distributions $(\mathbb{P}_n)_{n \ge 1}$ on a metric space $(S, d)$ is uniformly tight if for any $\varepsilon > 0$
there exists a compact $K \subseteq S$ such that $\mathbb{P}_n(K) > 1 - \varepsilon$ for all $n$. Our next goal is to prove the following
theorem.

Theorem 19 (Selection theorem) If $(\mathbb{P}_n)_{n \ge 1}$ is a uniformly tight sequence of laws on the metric space $(S, d)$
then there exists a subsequence $(n(k))$ such that $\mathbb{P}_{n(k)}$ converges weakly to some probability law $\mathbb{P}$.
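To see why uniform tightness is needed, compare two hypothetical sequences of point masses on $\mathbb{R}$ (an illustration, not taken from the notes): $\mathbb{P}_n = \delta_n$, whose mass escapes to infinity, and $\mathbb{P}_n = \delta_{(-1)^n}$, which stays inside $[-1, 1]$. The sketch below computes the mass of a compact interval $[-M, M]$ over a finite stretch of each sequence; a finite check can only suggest, not prove, tightness.

```python
def mass_in_interval(location, M):
    """Mass that the point mass at `location` assigns to the compact set [-M, M]."""
    return 1.0 if -M <= location <= M else 0.0


def min_mass(locations, M):
    """Smallest mass of [-M, M] over a finite stretch of a sequence of point masses."""
    return min(mass_in_interval(x, M) for x in locations)


N = 1000
escaping = [float(n) for n in range(1, N + 1)]         # P_n = delta_n
bounded = [float((-1) ** n) for n in range(1, N + 1)]  # P_n = delta_{(-1)^n}
```

For the escaping sequence, every interval $[-M, M]$ has $\mathbb{P}_n([-M, M]) = 0$ as soon as $n > M$, so the sequence is not uniformly tight and no subsequence converges weakly to a probability law. The bounded sequence is uniformly tight, and its even-indexed subsequence converges weakly to $\delta_1$.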

We will prove the Selection Theorem for arbitrary metric spaces, since this result will be useful to us later
when we study the convergence of laws on general metric spaces. However, when $S = \mathbb{R}^k$ one can proceed in
a more intuitive way, based on the following Lemma.

Lemma 12 (Cantor's diagonalization) Let $A \subseteq S$ be countable and $f_n : S \to \mathbb{R}$, $n \ge 1$. Then there exists
a subsequence $(n(k))$ such that $f_{n(k)}(a)$ converges for all $a \in A$ (if $f_{n(k)}(a)$ is unbounded, maybe to $\pm\infty$).

Proof. Let $A = \{a_1, a_2, \ldots\}$. Take $(n^1(k))$ such that $f_{n^1(k)}(a_1)$ converges. Take $(n^2(k)) \subseteq (n^1(k))$ such that
$f_{n^2(k)}(a_2)$ converges. By induction, take $(n^\ell(k)) \subseteq (n^{\ell-1}(k))$ such that $f_{n^\ell(k)}(a_\ell)$ converges. Now consider the
sequence $(n^k(k))$. Clearly, $f_{n^k(k)}(a_\ell)$ converges for any $\ell$ because for $k \ge \ell$, $n^k(k) \in \{n^\ell(k)\}$ by construction.

□ 
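The diagonal construction can be traced on a toy family where the nested subsequences are explicit. This is an assumption-laden sketch, not an example from the notes: take $f_n(a_j)$ to be the $j$-th binary digit of $n$ (with $j$ counted from 0), so that no $f_n(a_j)$ converges along the full sequence, and choose the $\ell$-th nested subsequence to be the multiples of $2^\ell$.

```python
def f(n, j):
    """Toy family (an assumption for illustration): f_n(a_j) is the j-th binary digit of n."""
    return (n >> j) & 1


def nested_index(level, k):
    """k-th term of the level-th nested subsequence, n^level(k) = k * 2**level.

    Along it, f_n(a_j) = 0 for every j < level, and each level is a
    subsequence of the previous one since k * 2**level = (2k) * 2**(level - 1).
    """
    return k * 2 ** level


def diagonal_index(k):
    """Diagonal subsequence n^k(k)."""
    return nested_index(k, k)
```

Along the diagonal indices $k \cdot 2^k$, the value $f_{n^k(k)}(a_j)$ equals 0 for every $k > j$, so the diagonal subsequence converges at every point of $A$ at once, while for instance $f_n(a_0)$ alternates between 1 and 0 along the full sequence.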

Define a joint c.d.f. on $\mathbb{R}^k$ by

$$F(t) = \mathbb{P}(X_1 \le t_1, \ldots, X_k \le t_k) \quad \text{where } t = (t_1, \ldots, t_k).$$

Let $A$ be a countable dense set of points in $\mathbb{R}^k$. By Lemma 12, there exists a subsequence $(n(k))$ such that $F_{n(k)}(a) \to F(a)$
for all $a \in A$. For $x \in \mathbb{R}^k \setminus A$ we can extend $F$ by

$$F(x) = \inf\{F(a) : a \in A, \ x_i < a_i\}.$$

$F(x)$ is a c.d.f. on $\mathbb{R}^k$ (exercise). The fact that $\mathbb{P}_n$ are uniformly tight ensures that $F(x) \to 0$ or $1$ if all
$x_i \to -\infty$ or $+\infty$. Let $x$ be a point of continuity of $F$ and let $a, b \in A$ be such that $a_i < x_i < b_i$ for all $i$. We
have

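$$F(a) \leftarrow F_{n(k)}(a) \le F_{n(k)}(x) \le F_{n(k)}(b) \to F(b)$$
as $k \to \infty$. Since $x$ is a point of continuity and $A$ is dense,

$$F(a) \to F(x) \ \text{as } a \uparrow x, \qquad F(b) \to F(x) \ \text{as } b \downarrow x,$$

and this proves that $F_{n(k)}(x) \to F(x)$ for all such $x$. Similarly to the one-dimensional case one can show that
for any $f \in C_b(\mathbb{R}^k)$,

$$\int f \, dF_{n(k)} \to \int f \, dF.$$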

Proof of Theorem 19. If $K$ is a compact then $C_b(K) = C(K)$. Later in these lectures, when we deal in
more detail with convergence on general metric spaces, we will prove the following fact which is well-known
and is a consequence of the Stone-Weierstrass theorem.

Fact. $C(K)$ is separable w.r.t. the $\ell_\infty$ norm $\|f\|_\infty = \sup_{x \in K} |f(x)|$.

Even though we are proving the Selection theorem for a general metric space, right now we are mostly
interested in the case $S = \mathbb{R}^k$ where this fact is a simple consequence of the Weierstrass theorem that any
continuous function can be approximated by polynomials.

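Since $\mathbb{P}_n$ are uniformly tight, for any $r \ge 1$ we can find a compact $K_r$ such that $\mathbb{P}_n(K_r) > 1 - \frac{1}{r}$ for all $n$. Let
$C_r \subseteq C(K_r)$ be a countable and dense subset of $C(K_r)$. By Cantor's diagonalization argument there exists
a subsequence $(n(k))$ such that $\mathbb{P}_{n(k)}(f)$ converges for all $f \in C_r$ for all $r \ge 1$, where for $f \in C(K_r)$ we write $\mathbb{P}_n(f)$ for $\int_{K_r} f \, d\mathbb{P}_n$. Since $C_r$ is dense in $C(K_r)$
this implies that $\mathbb{P}_{n(k)}(f)$ converges for all $f \in C(K_r)$ for all $r \ge 1$. Next, for any $f \in C_b(S)$,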

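$$\Bigl| \int f \, d\mathbb{P}_{n(k)} - \int_{K_r} f \, d\mathbb{P}_{n(k)} \Bigr| \le \int_{K_r^c} |f| \, d\mathbb{P}_{n(k)} \le \|f\|_\infty \, \mathbb{P}_{n(k)}(K_r^c) \le \frac{\|f\|_\infty}{r}.$$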

Since the integrals over $K_r$ converge as $k \to \infty$ for each $r$, this implies that the limit

$$I(f) := \lim_{k \to \infty} \int f \, d\mathbb{P}_{n(k)} \qquad (8.0.1)$$

exists. The question is why this limit is an integral over some probability measure $\mathbb{P}$. On each of the compacts
$K_r$ we could use Riesz's representation theorem for continuous functionals on $C(K_r)$ and then extend this
representation to the union of the $K_r$. Instead, we will prove this as a consequence of a more general result, the
Stone-Daniell theorem from measure theory, which says the following.
A family of functions $\alpha = \{f : S \to \mathbb{R}\}$ is called a vector lattice if

$$f, g \in \alpha \implies cf + g \in \alpha, \ \forall c \in \mathbb{R}, \quad \text{and} \quad f \wedge g, \ f \vee g \in \alpha.$$

A functional $I : \alpha \to \mathbb{R}$ is called a pre-integral if

1. $I(cf + g) = cI(f) + I(g)$,

2. $f \ge 0 \implies I(f) \ge 0$,

3. $f_n \downarrow 0$, $\|f_n\|_\infty < \infty \implies I(f_n) \to 0$.

Theorem 20 (Stone-Daniell) If $\alpha$ is a vector lattice and $I$ is a pre-integral on $\alpha$ then $I(f) = \int f \, d\mu$ for some
unique measure $\mu$ on the $\sigma$-algebra generated by functions in $\alpha$ (i.e. the minimal $\sigma$-algebra on which all functions in
$\alpha$ are measurable).

We will use this theorem with $\alpha = C_b(S)$ and $I$ defined in (8.0.1). The first two properties are obvious. To
prove the third one let us consider a sequence such that

$$f_n \downarrow 0, \quad 0 \le f_n(x) \le f_1(x) \le \|f_1\|_\infty.$$

On any compact $K_r$, $f_n \downarrow 0$ uniformly (by Dini's theorem), i.e.

$$\|f_n\|_{\infty, K_r} \le \varepsilon_{n,r} \to 0 \quad \text{as } n \to \infty.$$

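Since

$$\int f_n \, d\mathbb{P}_{n(k)} = \int_{K_r} f_n \, d\mathbb{P}_{n(k)} + \int_{K_r^c} f_n \, d\mathbb{P}_{n(k)} \le \varepsilon_{n,r} + \frac{1}{r} \|f_1\|_\infty$$

we get

$$I(f_n) = \lim_{k \to \infty} \int f_n \, d\mathbb{P}_{n(k)} \le \varepsilon_{n,r} + \frac{1}{r} \|f_1\|_\infty.$$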

Letting $n \to \infty$ and $r \to \infty$ we get that $I(f_n) \to 0$. By the Stone-Daniell theorem

$$I(f) = \int f \, d\mathbb{P}$$

for some measure $\mathbb{P}$ on $\sigma(C_b(S))$. The choice of $f = 1$ gives $I(f) = 1 = \mathbb{P}(S)$ which means that $\mathbb{P}$ is a
probability measure. Finally, let us show that $\sigma(C_b(S)) = \mathcal{B}$, the Borel $\sigma$-algebra generated by open sets. Since
any $f \in C_b(S)$ is measurable on $\mathcal{B}$ we get $\sigma(C_b(S)) \subseteq \mathcal{B}$. On the other hand, let $F \subseteq S$ be any closed set
and take a function $f(x) = \min(1, d(x, F))$. We have $|f(x) - f(y)| \le d(x, y)$ so $f \in C_b(S)$ and

$$f^{-1}(\{0\}) \in \sigma(C_b(S)).$$

However, since $F$ is closed, $f^{-1}(\{0\}) = \{x : d(x, F) = 0\} = F$ and this proves that $\mathcal{B} \subseteq \sigma(C_b(S))$.

□ 
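The last step of the proof uses the function $f(x) = \min(1, d(x, F))$. As a quick sanity check, here is a minimal sketch for the hypothetical choice $S = \mathbb{R}$ and $F = [0, 1]$ (both are assumptions made only for this example).

```python
def dist_to_F(x, lo=0.0, hi=1.0):
    """Distance from x to the closed set F = [lo, hi] (a hypothetical choice of F)."""
    return max(lo - x, x - hi, 0.0)


def f(x):
    """f(x) = min(1, d(x, F)): bounded by 1 and 1-Lipschitz."""
    return min(1.0, dist_to_F(x))
```

On a grid of test points the function satisfies $|f(x) - f(y)| \le |x - y|$, and its zero set is exactly $[0, 1]$, matching $f^{-1}(\{0\}) = F$.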

Theorem 21 If $\mathbb{P}_n$ converges weakly to $\mathbb{P}$ on $\mathbb{R}^k$ then $(\mathbb{P}_n)_{n \ge 1}$ is uniformly tight.

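Proof. For any $\varepsilon > 0$ there exists large enough $M > 0$ such that $\mathbb{P}(|x| > M) \le \varepsilon$. Consider a function

$$\alpha(s) = \begin{cases} 0, & s \le M, \\ 1, & s \ge 2M, \\ \frac{1}{M}(s - M), & M < s < 2M, \end{cases}$$

and let $\alpha(x) := \alpha(|x|)$ for $x \in \mathbb{R}^k$. Since $\mathbb{P}_n \to \mathbb{P}$ weakly,

$$\int \alpha(x) \, d\mathbb{P}_n \to \int \alpha(x) \, d\mathbb{P}.$$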

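This implies that

$$\mathbb{P}_n(|x| > 2M) \le \int \alpha(x) \, d\mathbb{P}_n \to \int \alpha(x) \, d\mathbb{P} \le \mathbb{P}(|x| > M) \le \varepsilon.$$

For $n$ large enough, $n \ge n_0$, we get $\mathbb{P}_n(|x| > 2M) \le 2\varepsilon$. For $n < n_0$ choose $M_n$ so that $\mathbb{P}_n(|x| > M_n) \le 2\varepsilon$.
Take $M' = \max\{M_1, \ldots, M_{n_0 - 1}, 2M\}$. As a result, $\mathbb{P}_n(|x| > M') \le 2\varepsilon$ for all $n \ge 1$.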

□ 
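The cutoff function in this proof is continuous and squeezed between two indicators, $I(s > 2M) \le \alpha(s) \le I(s > M)$, which is what gives the two outer inequalities in the display. A minimal sketch of the one-dimensional profile (the function name `alpha` mirrors the symbol in the proof; the rest is illustrative):

```python
def alpha(s, M):
    """Piecewise linear cutoff: 0 for s <= M, 1 for s >= 2M, linear in between."""
    if s <= M:
        return 0.0
    if s >= 2 * M:
        return 1.0
    return (s - M) / M
```

For a point $x \in \mathbb{R}^k$ one would evaluate `alpha` at the norm $|x|$, as in the proof.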

Lemma 13 If for any sequence $(n(k))_{k \ge 1}$ there exists a subsequence $(n(k(r)))_{r \ge 1}$ such that $\mathbb{P}_{n(k(r))} \to \mathbb{P}$
weakly then $\mathbb{P}_n \to \mathbb{P}$ weakly.

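Proof. Suppose not. Then for some $f \in C_b(S)$ and for some $\varepsilon > 0$ there exists a subsequence $(n(k))$ such
that

$$\Bigl| \int f \, d\mathbb{P}_{n(k)} - \int f \, d\mathbb{P} \Bigr| > \varepsilon.$$
But this contradicts the fact that for some subsequence $\mathbb{P}_{n(k(r))} \to \mathbb{P}$ weakly.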

□ 

Consider r.v.s $X$ and $X_n$ on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$ with values in a metric space $(S, d)$. Let $\mathbb{P}$ and
$\mathbb{P}_n$ be their corresponding laws on Borel sets $\mathcal{B}$ in $S$. Convergence of $X_n$ to $X$ in probability and almost
surely is defined exactly the same way as for $S = \mathbb{R}$ by replacing $|X_n - X|$ with $d(X_n, X)$.

Lemma 14 $X_n \to X$ in probability iff for any sequence $(n(k))$ there exists a subsequence $(n(k(r)))$ such
that $X_{n(k(r))} \to X$ a.s.

Proof. "$\Longleftarrow$". Suppose $X_n$ does not converge to $X$ in probability. Then for small enough $\varepsilon > 0$ there exists
a subsequence $(n(k))$ such that

$$\mathbb{P}\bigl(d(X, X_{n(k)}) \ge \varepsilon\bigr) \ge \varepsilon.$$

This contradicts the existence of a subsequence $X_{n(k(r))}$ that converges to $X$ a.s.
"$\Longrightarrow$". Given a subsequence $(n(k))$ let us choose $(k(r))$ so that

$$\mathbb{P}\Bigl(d(X_{n(k(r))}, X) \ge \frac{1}{r}\Bigr) \le \frac{1}{r^2}.$$

By the Borel-Cantelli lemma, these events can occur i.o. with probability 0, which means that with probability
one for large enough $r$

$$d(X_{n(k(r))}, X) < \frac{1}{r},$$

i.e. $X_{n(k(r))} \to X$ a.s.

□ 
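A standard example that separates the two modes of convergence in Lemma 14 is the "typewriter" sequence on $\Omega = [0, 1)$ with Lebesgue measure (an illustration added here, not part of the notes). Writing $n = 2^m + j$ with $0 \le j < 2^m$, let $X_n$ be the indicator of $[j/2^m, (j+1)/2^m)$. Then $X_n \to 0$ in probability, $X_n(\omega)$ converges for no $\omega$, and the subsequence $n(k) = 2^k$ converges to 0 for every $\omega > 0$. A minimal sketch with hypothetical function names:

```python
def typewriter(n, omega):
    """Value at omega in [0, 1) of the n-th 'typewriter' indicator, n >= 1.

    Writing n = 2**m + j with 0 <= j < 2**m, X_n is the indicator of the
    interval [j / 2**m, (j + 1) / 2**m).
    """
    m = n.bit_length() - 1
    j = n - (1 << m)
    return 1 if j <= omega * 2 ** m < j + 1 else 0


def prob_nonzero(n):
    """Lebesgue measure of {X_n = 1}, i.e. P(|X_n| >= eps) for any 0 < eps <= 1."""
    return 2.0 ** -(n.bit_length() - 1)
```

For a fixed $\omega$, each dyadic block $2^m \le n < 2^{m+1}$ contains exactly one index with $X_n(\omega) = 1$, so the full sequence keeps returning to 1 even though `prob_nonzero(n)` tends to 0.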

Lemma 15 If $X_n \to X$ in probability then $X_n \to X$ weakly.

Proof. By Lemma 14, for any subsequence $(n(k))$ there exists a subsequence $(n(k(r)))$ such that $X_{n(k(r))} \to X$
a.s. Given $f \in C_b(S)$, by the dominated convergence theorem,

$$\mathbb{E} f(X_{n(k(r))}) \to \mathbb{E} f(X),$$

i.e. $X_{n(k(r))} \to X$ weakly. By Lemma 13, $X_n \to X$ weakly.

□